Suffix tree

In computer science, a suffix tree (also called PAT tree or, in an earlier form, position tree) is a compressed trie containing all the suffixes of a given text as its keys and their positions in the text as its values. Suffix trees allow particularly fast implementations of many important string operations.

Constructing such a tree for a string $S$ takes time and space linear in the length of $S$. Once the tree is built, several operations can be performed quickly, such as locating a substring in $S$, locating a substring when a certain number of mistakes is allowed, and locating matches for a regular expression pattern. Suffix trees also provided one of the first linear-time solutions for the longest common substring problem.^[2] These speedups come at a cost: storing a string's suffix tree typically requires significantly more space than storing the string itself.
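As an illustration of the structure and of substring location, the following is a minimal Python sketch. It uses a naive quadratic-time construction that inserts one suffix at a time, not one of the linear-time algorithms referred to above. The names `Node`, `build_suffix_tree`, and `find_occurrences` are hypothetical, as is the choice of `"$"` as a terminator; the sketch assumes the terminator occurs in neither the text nor the pattern.

```python
class Node:
    """One node of the compressed trie (names here are illustrative only)."""

    def __init__(self):
        # Maps the first character of an edge label to (edge label, child node).
        self.children = {}
        # For leaves: start index of the suffix that ends here; None otherwise.
        self.position = None


def build_suffix_tree(text, terminator="$"):
    """Naive O(n^2) construction: insert every suffix of text + terminator."""
    if terminator in text:
        raise ValueError("terminator must not occur in text")
    s = text + terminator
    root = Node()
    for start in range(len(s)):
        node, i = root, start
        while True:
            first = s[i]
            if first not in node.children:
                # No edge starts with this character: hang a new leaf here.
                leaf = Node()
                leaf.position = start
                node.children[first] = (s[i:], leaf)
                break
            label, child = node.children[first]
            # Length of the common prefix of the edge label and s[i:].
            k = 0
            while k < len(label) and i + k < len(s) and s[i + k] == label[k]:
                k += 1
            if k == len(label):
                # Whole edge matched: continue from the child node.
                node, i = child, i + k
                continue
            # Mismatch inside the edge: split it at offset k.
            mid = Node()
            mid.children[label[k]] = (label[k:], child)
            leaf = Node()
            leaf.position = start
            mid.children[s[i + k]] = (s[i + k:], leaf)
            node.children[first] = (label[:k], mid)
            break
    return root


def find_occurrences(root, pattern):
    """Return the sorted start positions of pattern in the indexed text."""
    node, i = root, 0
    while i < len(pattern):
        entry = node.children.get(pattern[i])
        if entry is None:
            return []
        label, child = entry
        k = min(len(label), len(pattern) - i)
        if label[:k] != pattern[i:i + k]:
            return []
        node, i = child, i + k
    # Every leaf below the reached node is a suffix that starts with pattern.
    positions, stack = [], [node]
    while stack:
        current = stack.pop()
        if current.position is not None:
            positions.append(current.position)
        stack.extend(child for _, child in current.children.values())
    return sorted(positions)


tree = build_suffix_tree("banana")
print(find_occurrences(tree, "ana"))  # [1, 3]
```

In this sketch, walking down from the root takes time proportional to the pattern length, and collecting the leaves below the reached node takes time proportional to the size of that subtree. Storing edge labels as copied substrings keeps the code short but uses quadratic space in the worst case; implementations that aim for linear space typically store index pairs into the text instead.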

[1] Donald E. Knuth; James H. Morris; Vaughan R. Pratt (Jun 1977). "Fast Pattern Matching in Strings" (PDF). SIAM Journal on Computing. 6 (2): 323–350. doi:10.1137/0206024. Here: p.339 bottom.

[2] Knuth conjectured in 1970 that the problem could not be solved in linear time.^[1] In 1973, this conjecture was refuted by Weiner's suffix-tree algorithm (Weiner 1973).
